Smart Paper Notes

Explicit Boundary Guided Semi-Push-Pull Contrastive Learning for Better Anomaly Detection

Xincheng Yao , Chongyang Zhang , Ruoqi Li

Category: Computer Vision

2022-07-04

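Most anomaly detection algorithms focus on modeling the distribution of normal samples and treat anomalies as outliers. However, because the model lacks knowledge about anomalies, its discriminative performance may be insufficient, so anomalies should be exploited wherever possible. Using a few known anomalies during training raises another problem: the model may become biased toward the known anomalies and fail to generalize to unseen ones. In this paper, we aim to exploit a few existing anomalies with a carefully designed explicit boundary guided semi-push-pull learning strategy, which enhances discriminability while mitigating the bias caused by having too few known anomalies. Our model is based on two core designs. First, we find an explicit separating boundary that guides the subsequent contrastive learning. Specifically, we employ a normalizing flow to learn the normal feature distribution and then find an explicit separating boundary close to the edge of that distribution. The resulting explicit and compact boundary depends only on the normal feature distribution, so the bias caused by a few known anomalies is mitigated. Second, we learn more discriminative features under the guidance of the explicit separating boundary. A boundary guided semi-push-pull loss is developed to pull normal features together while pushing abnormal features away from the separating boundary, beyond a certain margin region. In this way, our model forms a more explicit and discriminative decision boundary, achieves better results on both known and unseen anomalies, and maintains high training efficiency. Extensive experiments on the widely used MVTecAD benchmark show that the proposed method achieves new state-of-the-art results, with 98.8% image-level AUROC and 99.4% pixel-level AUROC.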

Translated by Google Translate
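
The two core designs described in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the scores stand in for log-likelihoods from a normalizing flow, the boundary is taken as a low quantile of the normal scores, and the function names `find_boundary` and `semi_push_pull_loss`, the `quantile` and `margin` parameters, and the hinge form of the loss are hypothetical.

```python
def find_boundary(normal_scores, quantile=0.05):
    """Pick a boundary near the low edge of the normal score distribution.

    Assumption: scores are log-likelihoods, so higher means "more normal".
    Only normal samples are used, so known anomalies cannot bias the boundary.
    """
    ordered = sorted(normal_scores)
    index = int(quantile * (len(ordered) - 1))
    return ordered[index]


def semi_push_pull_loss(scores, is_anomaly, boundary, margin):
    """Hinge-style sketch of a boundary guided semi-push-pull loss.

    Pull: a normal sample is penalized only if it falls below the boundary.
    Push: an anomaly is penalized only if it sits above boundary - margin.
    Samples already on the correct side contribute nothing ("semi").
    """
    total = 0.0
    for score, anomalous in zip(scores, is_anomaly):
        if anomalous:
            total += max(0.0, score - (boundary - margin))
        else:
            total += max(0.0, boundary - score)
    return total / len(scores)


if __name__ == "__main__":
    normal = [-1.0, -2.0, -3.0, -4.0, -5.0]
    b = find_boundary(normal)
    print("boundary:", b)
    print("loss:", semi_push_pull_loss([-2.0, -1.5], [False, True], -1.0, 1.0))
```

Because the boundary is computed from normal scores alone, adding or removing known anomalies changes only the push term, which is the property the abstract relies on to limit bias.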

Analogical Inference Enhanced Knowledge Graph Embedding

Yao Zhen , Zhang Wen , Chen Mingyang , Huang Yufeng , Yang Yi , Chen Huajun

Category: Artificial Intelligence | Natural Language Processing

2023-01-03

Knowledge graph embedding (KGE), which maps entities and relations in a knowledge graph into continuous vector spaces, has achieved great success in predicting missing links in knowledge graphs. However, knowledge graphs often contain incomplete triples that are difficult for KGE models to infer inductively. To address this challenge, we resort to analogical inference and propose AnKGE, a novel and general self-supervised framework that enhances KGE models with analogical inference capability. We propose an analogical object retriever that retrieves appropriate analogical objects at the entity, relation, and triple levels. In AnKGE, we train an analogy function for each level of analogical inference; it takes the original element embedding from a well-trained KGE model as input and outputs the analogical object embedding. To combine the inductive inference capability of the original KGE model with the analogical inference capability added by AnKGE, we interpolate the analogy score with the base model score and introduce adaptive weights in the score function used for prediction. Through extensive experiments on the FB15k-237 and WN18RR datasets, we show that AnKGE achieves competitive results on the link prediction task and performs analogical inference well.


Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding

Jiahao Zhu , Daizong Liu , Pan Zhou , Xing Di , Yu Cheng , Song Yang , Wenzheng Xu , Zichuan Xu , Yao Wan , Lichao Sun

Category: Computer Vision

2023-01-02

Temporal sentence grounding (TSG) aims to identify the temporal boundary of a specific segment in an untrimmed video given a sentence query. All existing works first use a sparse sampling strategy to extract a fixed number of video frames and then conduct multi-modal interactions with the query sentence for reasoning. However, we argue that these methods overlook two issues that cannot be ignored: 1) Boundary-bias: the annotated target segment generally refers to two specific frames as its start and end timestamps. The video downsampling process may lose these two frames and take adjacent irrelevant frames as the new boundaries. 2) Reasoning-bias: such incorrect new boundary frames also lead to reasoning bias during frame-query interaction, reducing the generalization ability of the model. To alleviate these limitations, we propose a novel Siamese Sampling and Reasoning Network (SSRN) for TSG, which introduces a siamese sampling mechanism to generate additional contextual frames that enrich and refine the new boundaries. Specifically, a reasoning strategy is developed to learn the inter-relationship among these frames and generate soft labels on boundaries for more accurate frame-query reasoning. This mechanism can also supplement the sparsely sampled frames with the missing consecutive visual semantics for fine-grained activity understanding. Extensive experiments demonstrate the effectiveness of SSRN on three challenging datasets.
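
The boundary-bias issue described above can be reproduced with a small sketch. It assumes uniform sparse sampling that keeps the middle frame of each equal-length chunk; the function names `sparse_sample` and `snapped_boundaries` and the frame numbers are hypothetical and only illustrate how annotated boundaries drift once their frames are dropped.

```python
def sparse_sample(num_frames, num_samples):
    """Uniformly keep num_samples frame indices out of range(num_frames).

    Assumption: the kept frame is the middle of each equal-length chunk.
    """
    step = num_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


def snapped_boundaries(sampled, start, end):
    """Return the sampled frames closest to the annotated start and end."""
    def nearest(target):
        return min(sampled, key=lambda frame: abs(frame - target))
    return nearest(start), nearest(end)


if __name__ == "__main__":
    sampled = sparse_sample(num_frames=100, num_samples=10)
    start, end = 22, 48  # annotated boundary frames (illustrative values)
    new_start, new_end = snapped_boundaries(sampled, start, end)
    print("sampled frames:", sampled)
    print("annotated:", (start, end), "after sampling:", (new_start, new_end))
```

In this toy case neither annotated frame survives sampling, so the segment boundaries shift by three frames on each side before any reasoning takes place.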


Invertible normalizing flow neural networks by JKO scheme

Chen Xu , Xiuyuan Cheng , Yao Xie

Category: Machine Learning (Statistics) | Machine Learning

2022-12-29

Normalizing flows are a class of deep generative models for efficient sampling and density estimation. In practice, the flow often appears as a chain of invertible neural network blocks; to facilitate training, existing works have regularized flow trajectories and designed special network architectures. The current paper develops a neural ODE flow network inspired by the Jordan-Kinderlehrer-Otto (JKO) scheme, which allows efficient block-wise training of the residual blocks and avoids inner loops of score matching or variational learning. As the JKO scheme unfolds the dynamics of gradient flow, the proposed model naturally stacks residual network blocks one by one, reducing the memory load and the difficulty of end-to-end training of deep flow networks. We also develop an adaptive time reparameterization of the flow network with a progressive refinement of the trajectory in probability space, which improves training efficiency and accuracy in practice. Using numerical experiments with synthetic and real data, we show that the proposed JKO-iFlow model achieves similar or better performance in generating new samples compared with existing flow and diffusion models, at a significantly reduced computational and memory cost.
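
For reference, the classical JKO scheme that the abstract builds on can be written as a sequence of proximal steps in Wasserstein space. The symbols below are standard textbook notation and not necessarily the paper's: $\rho_k$ is the density after step $k$, $h > 0$ is the step size, $W_2$ is the 2-Wasserstein distance, and $F$ is the functional being minimized. Taking $F$ to be a KL divergence to a fixed target density $\pi$ is stated here as one common choice, not as the paper's exact objective.

```latex
\rho_{k+1} = \operatorname*{arg\,min}_{\rho}
  \left\{ F(\rho) + \frac{1}{2h}\, W_2^2(\rho, \rho_k) \right\},
  \qquad k = 0, 1, 2, \dots,
\qquad \text{e.g. } F(\rho) = \mathrm{KL}(\rho \,\|\, \pi).
```

Each step depends only on the previous density $\rho_k$, which is why one block per step can be trained in sequence rather than end to end.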


Exploring Vision Transformers as Diffusion Learners

He Cao , Jianan Wang , Tianhe Ren , Xianbiao Qi , Yihao Chen , Yuan Yao , Lei Zhang

Category: Computer Vision

2022-12-28

Score-based diffusion models have captured widespread attention and fueled rapid progress in recent vision generative tasks. In this paper, we focus on the diffusion model backbone, which has been largely neglected so far. We systematically explore vision Transformers as diffusion learners for various generative tasks. With our improvements, the performance of a vanilla ViT-based backbone (IU-ViT) is boosted to be on par with traditional U-Net-based methods. We further provide a hypothesis on the implication of disentangling the generative backbone into an encoder-decoder structure, and we show proof-of-concept experiments verifying the effectiveness of a stronger encoder for generative tasks with ASymmetriC ENcoder Decoder (ASCEND). Our improvements achieve competitive results on CIFAR-10, CelebA, LSUN, CUB Bird and large-resolution text-to-image tasks. To the best of our knowledge, we are the first to successfully train a single diffusion model on a text-to-image task beyond 64x64 resolution. We hope this will motivate people to rethink the modeling choices and training pipelines for diffusion-based generative models.


Distribution Estimation of Contaminated Data via DNN-based MoM-GANs

Fang Xie , Lihu Xu , Qiuran Yao , Huiming Zhang

Category: Machine Learning (Statistics) | Machine Learning

2022-12-28

This paper studies distribution estimation for contaminated data with the MoM-GAN method, which combines a generative adversarial net (GAN) with median-of-means (MoM) estimation. We use deep neural networks (DNNs) with ReLU activation functions to model the generator and discriminator of the GAN. Theoretically, we derive a non-asymptotic error bound for the DNN-based MoM-GAN estimator, measured by integral probability metrics defined over the $b$-smooth Hölder class. The error bound decreases essentially as $n^{-b/p}\vee n^{-1/2}$, where $n$ and $p$ are the sample size and the dimension of the input data. We give an algorithm for the MoM-GAN method and implement it in two real applications. The numerical results show that MoM-GAN outperforms other competitive methods when dealing with contaminated data.
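
The median-of-means idea named in the abstract can be shown on plain numbers. This is a sketch of the basic scalar MoM estimator only, not of the MoM-GAN training procedure; the function name `median_of_means`, the use of consecutive blocks without shuffling, and the sample values are assumptions made for illustration.

```python
import statistics


def median_of_means(values, num_blocks):
    """Split values into equal blocks, average each block, return the median.

    Assumption: blocks are consecutive slices; leftover samples that do not
    fill a complete block are dropped. In practice the data is usually
    shuffled before it is split.
    """
    block_size = len(values) // num_blocks
    if block_size == 0:
        raise ValueError("need at least one sample per block")
    block_means = [
        statistics.fmean(values[i * block_size:(i + 1) * block_size])
        for i in range(num_blocks)
    ]
    return statistics.median(block_means)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0, 1000.0]  # one contaminated sample
    print("plain mean:", statistics.fmean(data))
    print("median of means:", median_of_means(data, num_blocks=3))
```

A single outlier can corrupt at most one block mean, so the median over blocks stays close to the bulk of the data while the plain mean does not.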


A Lightweight Reconstruction Network for Surface Defect Inspection

Chao Hu , Jian Yao , Weijie Wu , Weibin Qiu , Liqiang Zhu

Category: Computer Vision | Machine Learning

2022-12-25

Currently, most deep learning methods cannot cope with the scarcity of industrial product defect samples and the significant differences in their characteristics. This paper proposes an unsupervised defect detection algorithm based on a reconstruction network, which is trained using only a large number of easily obtained defect-free samples. The network includes two parts: image reconstruction and surface defect area detection. The reconstruction network is designed as a fully convolutional autoencoder with a lightweight structure. Only a small number of normal samples are used for training, so that the reconstruction network can generate a defect-free reconstructed image. A function combining structural loss and $L_1$ loss is proposed as the loss function of the reconstruction network to address the poor detection of defects on irregularly textured surfaces. Further, the residual between the reconstructed image and the image under test is used as the candidate defect region, and conventional image operations then localize the defect. The proposed unsupervised defect detection algorithm is evaluated on multiple defect image sample sets. Compared with other similar algorithms, the results show that it has strong robustness and accuracy.
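
The combined loss and the residual step described above can be sketched on flattened grayscale images with values in [0, 1]. Several things here are assumptions rather than details from the abstract: the structural term is taken to be one minus a single global SSIM value (practical SSIM uses local windows), the weight `alpha`, the `threshold`, and all function names are hypothetical, and the constants `c1` and `c2` are the usual SSIM defaults for a dynamic range of 1.

```python
def l1_loss(x, y):
    """Mean absolute difference between two flattened images."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def global_ssim(x, y, c1=1e-4, c2=9e-4):
    """SSIM computed once over the whole image (simplified, no windows)."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) * (a - mu_x) for a in x) / n
    var_y = sum((b - mu_y) * (b - mu_y) for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def reconstruction_loss(x, y, alpha=0.5):
    """Weighted sum of a structural term (1 - SSIM) and the L1 term."""
    return alpha * (1.0 - global_ssim(x, y)) + (1.0 - alpha) * l1_loss(x, y)


def defect_mask(reconstructed, tested, threshold=0.3):
    """Mark pixels whose reconstruction residual exceeds the threshold."""
    return [abs(r - t) > threshold for r, t in zip(reconstructed, tested)]


if __name__ == "__main__":
    clean = [0.1, 0.5, 0.9, 0.5]
    tested = [0.1, 0.5, 0.2, 0.5]  # third pixel differs from the clean image
    print("loss on identical images:", reconstruction_loss(clean, clean))
    print("loss on differing images:", reconstruction_loss(clean, tested))
    print("defect mask:", defect_mask(clean, tested))
```

The mask is the simplest possible "conventional image operation"; morphological filtering or connected-component analysis would normally follow to clean up the candidate region.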


Stochastic Methods for AUC Optimization subject to AUC-based Fairness Constraints

Yao Yao , Qihang Lin , Tianbao Yang

Category: Machine Learning | Machine Learning (Statistics)

2022-12-23

As machine learning is increasingly used to make high-stakes decisions, an emerging challenge is to avoid unfair AI systems that lead to discriminatory decisions against protected populations. A direct approach to obtaining a fair predictive model is to train the model by optimizing its prediction performance subject to fairness constraints, which achieves Pareto efficiency when trading off performance against fairness. Among various fairness metrics, those based on the area under the ROC curve (AUC) have recently emerged because they are threshold-agnostic and effective for unbalanced data. In this work, we formulate the training problem of a fairness-aware machine learning model as an AUC optimization problem subject to a class of AUC-based fairness constraints. This problem can be reformulated as a min-max optimization problem with min-max constraints, which we solve with stochastic first-order methods based on a new Bregman divergence designed for the special structure of the problem. We numerically demonstrate the effectiveness of our approach on real-world data under different fairness metrics.
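
To make the notion of an AUC-based fairness constraint concrete, the sketch below computes the pairwise AUC and the gap between per-group AUCs. The abstract does not say which metrics the paper uses, so the per-group AUC gap is an illustrative choice; the function names `auc` and `group_auc_gap` and the sample data are hypothetical, and the stochastic min-max solver itself is not shown.

```python
def auc(scores, labels):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Labels are 1 (positive) or 0.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def group_auc_gap(scores, labels, groups):
    """Largest difference between AUCs computed within each group."""
    per_group = []
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        per_group.append(auc([scores[i] for i in idx], [labels[i] for i in idx]))
    return max(per_group) - min(per_group)


if __name__ == "__main__":
    scores = [0.9, 0.4, 0.5, 0.1, 0.8, 0.7, 0.3, 0.2]
    labels = [1, 1, 0, 0, 1, 1, 0, 0]
    groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print("overall AUC:", auc(scores, labels))
    print("per-group AUC gap:", group_auc_gap(scores, labels, groups))
```

A constraint of the form "gap at most epsilon" needs no classification threshold, which is the threshold-agnostic property the abstract highlights.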


Multi-Projection Fusion and Refinement Network for Salient Object Detection in 360° Omnidirectional Image

Runmin Cong , Ke Huang , Jianjun Lei , Yao Zhao , Qingming Huang , Sam Kwong

Category: Computer Vision

2022-12-23

Salient object detection (SOD) aims to determine the most visually attractive objects in an image. With the development of virtual reality technology, 360° omnidirectional images have been widely used, but the SOD task on 360° omnidirectional images is seldom studied because of their severe distortions and complex scenes. In this paper, we propose a Multi-Projection Fusion and Refinement Network (MPFR-Net) to detect salient objects in 360° omnidirectional images. Unlike existing methods, the equirectangular projection image and four corresponding cube-unfolding images are fed into the network simultaneously as inputs, where the cube-unfolding images not only provide supplementary information for the equirectangular projection image but also ensure the object integrity of the cube-map projection. To make full use of these two projection modes, a Dynamic Weighting Fusion (DWF) module is designed to adaptively integrate the features of different projections in a complementary and dynamic manner from the perspective of inter and intra features. Furthermore, to fully explore the interaction between encoder and decoder features, a Filtration and Refinement (FR) module is designed to suppress the redundant information between the feature itself and the feature. Experimental results on two omnidirectional datasets demonstrate that the proposed approach outperforms the state-of-the-art methods both qualitatively and quantitatively.


Understanding Postpartum Parents' Experiences via Two Digital Platforms

Xuewen Yao , Miriam Mikhelson , Megan Micheletti , Eunsol Choi , S Craig Watkins , Edison Thomaz , Kaya De Barbaro

Category: Natural Language Processing

2022-12-22

Digital platforms, including online forums and helplines, have emerged as avenues of support for caregivers suffering from postpartum mental health distress. Understanding support seekers' experiences as shared on these platforms could provide crucial insight into caregivers' needs during this vulnerable time. In the current work, we provide a descriptive analysis of the concerns, psychological states, and motivations shared by healthy and distressed postpartum support seekers on two digital platforms, a one-on-one digital helpline and a publicly available online forum. Using a combination of human annotations, dictionary models and unsupervised techniques, we find stark differences between the experiences of distressed and healthy mothers. Distressed mothers described interpersonal problems and a lack of support, with 8.60% - 14.56% reporting severe symptoms including suicidal ideation. In contrast, the majority of healthy mothers described childcare issues, such as questions about breastfeeding or sleeping, and reported no severe mental health concerns. Across the two digital platforms, we found that distressed mothers shared similar content. However, the patterns of speech and affect shared by distressed mothers differed between the helpline vs. the online forum, suggesting the design of these platforms may shape meaningful measures of their support-seeking experiences. Our results provide new insight into the experiences of caregivers suffering from postpartum mental health distress. We conclude by discussing methodological considerations for understanding content shared by support seekers and design considerations for the next generation of support tools for postpartum parents.
